Signal Separation Motivated by Human Auditory Perception: Applications to Automatic Speech Recognition
نویسنده
چکیده
The human auditory system uses a number of well-identified cues to segregate and separate individual sound sources in a complex acoustical environment. For example, researchers in auditory scene analysis have long identified cues such as common onset, correlated fluctuations in instantaneous amplitude and frequency, harmonicity, and common interaural time and amplitude differences as ways of identifying which components of a complex signal are derived from a common source. It is widely believed that the use of these cues to achieve such “grouping” and signal separation should be very useful in improving the accuracy of automatic speech recognition in very difficult environments such as competing speech, background music, and transient noise, and this has been a goal of several research groups in computational auditory scene analysis. This talk describes and discusses several ways in which signals can be separated using physiologically-motivated cues, along with the potential benefit to be derived from such separation for automatic speech recognition.
منابع مشابه
A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation
Abstract Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by making a natural man-machine interaction. Since speech is the most popular method of communication, recognizing human emotions from speech signal becomes a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...
متن کاملApplying physiologically-motivated models of auditory processing to automatic speech recognition
For many years the human auditory system has been an inspiration for developers of automatic speech recognition systems because of its ability to interpret speech accurately in a wide variety of difficult acoustical environments. This paper discusses the application of physiologically-motivated approaches to signal processing that facilitate robust automatic speech recognition in environments w...
متن کاملEffect of signal to noise ratio on the speech perception ability of older adults
Background: Speech perception ability depends on auditory and extra-auditory elements. The signal-to-noise ratio (SNR) is an extra-auditory element that has an effect on the ability to normally follow speech and maintain a conversation. Speech in noise perception difficulty is a common complaint of the elderly. In this study, the importance of SNR magnitude as an extra-auditory effect on speech...
متن کاملSignal Processing for Robust Speech Recognition Motivated by Auditory Processing
Although automatic speech recognition systems have dramatically improved in recent decades, speech recognition accuracy still significantly degrades in noisy environments. While many algorithms have been developed to deal with this problem, they tend to be more effective in stationary noise such as white or pink noise than in the presence of more realistic degradations such as background music,...
متن کاملAn automatic speech recognition system based on the scene analysis account of auditory perception
Despite many years of concentrated research, the performance gap between automatic speech recognition (ASR) and human speech recognition (HSR) remains large. The difference between ASR and HSR is particularly evident when considering the response to additive noise. Whereas human performance is remarkably robust, ASR systems are brittle and only operate well within the narrow range of noise cond...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 1993